Topic modeling for untargeted substructure exploration in metabolomics.

Authors

  • Justin Johan Jozias van der Hooft
  • Joe Wandy
  • Michael P Barrett
  • Karl E V Burgess
  • Simon Rogers
Abstract

The potential of untargeted metabolomics to answer important questions across the life sciences is hindered because of a paucity of computational tools that enable extraction of key biochemically relevant information. Available tools focus on using mass spectrometry fragmentation spectra to identify molecules whose behavior suggests they are relevant to the system under study. Unfortunately, fragmentation spectra cannot identify molecules in isolation but require authentic standards or databases of known fragmented molecules. Fragmentation spectra are, however, replete with information pertaining to the biochemical processes present, much of which is currently neglected. Here, we present an analytical workflow that exploits all fragmentation data from a given experiment to extract biochemically relevant features in an unsupervised manner. We demonstrate that an algorithm originally used for text mining, latent Dirichlet allocation, can be adapted to handle metabolomics datasets. Our approach extracts biochemically relevant molecular substructures ("Mass2Motifs") from spectra as sets of co-occurring molecular fragments and neutral losses. The analysis allows us to isolate molecular substructures, whose presence allows molecules to be grouped based on shared substructures regardless of classical spectral similarity. These substructures, in turn, support putative de novo structural annotation of molecules. Combining this spectral connectivity to orthogonal correlations (e.g., common abundance changes under system perturbation) significantly enhances our ability to provide mechanistic explanations for biological behavior.
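The abstract describes treating fragmentation spectra like text documents, with fragments and neutral losses acting as the "words" whose co-occurrence defines a Mass2Motif. The following is a minimal sketch of only that featurization step, not the authors' implementation. The function names (`spectrum_to_words`, `build_count_matrix`), the two-decimal binning, the use of rounded intensities as word counts, and the toy m/z values are all assumptions made for illustration.

```python
from collections import Counter


def spectrum_to_words(precursor_mz, peaks, decimals=2):
    """Turn one fragmentation spectrum into a bag of 'words'.

    peaks is a list of (mz, intensity) pairs. Each fragment becomes a
    'fragment_<m/z>' word, and each precursor-minus-fragment difference
    becomes a 'loss_<mass>' word. Rounded intensities serve as counts
    (an assumption of this sketch).
    """
    words = Counter()
    for mz, intensity in peaks:
        count = int(round(intensity))
        if count <= 0:
            continue
        words[f"fragment_{round(mz, decimals):.{decimals}f}"] += count
        loss = precursor_mz - mz
        if loss > 0:
            words[f"loss_{round(loss, decimals):.{decimals}f}"] += count
    return words


def build_count_matrix(spectra, decimals=2):
    """Build a spectra-by-words count matrix, analogous to a
    document-term matrix in text mining."""
    docs = [spectrum_to_words(p, peaks, decimals) for p, peaks in spectra]
    vocab = sorted({w for d in docs for w in d})
    matrix = [[d.get(w, 0) for w in vocab] for d in docs]
    return vocab, matrix


# Hypothetical toy spectra: (precursor m/z, [(fragment m/z, intensity), ...]).
# The numbers are made up and do not correspond to real metabolites.
toy_spectra = [
    (200.00, [(100.00, 50.0), (182.00, 10.0)]),
    (300.00, [(100.00, 30.0), (282.00, 5.0)]),
]

vocab, matrix = build_count_matrix(toy_spectra)

# Words present in every spectrum: candidates for a shared substructure.
shared = [w for j, w in enumerate(vocab) if all(row[j] > 0 for row in matrix)]
print(shared)
```

A count matrix of this shape could then be passed to any latent Dirichlet allocation implementation, so that each inferred topic plays the role of a candidate motif. In the toy data the two spectra share a fragment and a neutral loss even though their precursor masses differ, which is the kind of grouping by shared substructure that the abstract describes.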

Similar resources

A Conversation on Data Mining Strategies in LC-MS Untargeted Metabolomics: Pre-Processing and Pre-Treatment Steps

Untargeted metabolomic studies generate information-rich, high-dimensional, and complex datasets that remain challenging to handle and fully exploit. Despite the remarkable progress in the development of tools and algorithms, the "exhaustive" extraction of information from these metabolomic datasets is still a non-trivial undertaking. A conversation on data mining strategies for a maximal infor...

Untargeted metabolomics suffers from incomplete data analysis

Introduction: Untargeted metabolomics is a powerful tool for biological discoveries. Significant advances in computational approaches to analyzing the complex raw data have been made, yet it is not clear how exhaustive and reliable are the data analysis results. Objectives: Assessment of the quality of data analysis results in untargeted metabolomics. Methods: Five published untargeted metabolo...

Amino Acid Metabolism is Altered in Adolescents with Nonalcoholic Fatty Liver Disease-An Untargeted, High Resolution Metabolomics Study.

OBJECTIVE To conduct an untargeted, high resolution exploration of metabolic pathways that were altered in association with hepatic steatosis in adolescents. STUDY DESIGN This prospective, case-control study included 39 Hispanic-American, obese adolescents aged 11-17 years evaluated for hepatic steatosis using magnetic resonance spectroscopy. Of these 39 individuals, 30 had hepatic steatosis ≥...

Structured plant metabolomics for the simultaneous exploration of multiple factors

Multiple factors act simultaneously on plants to establish complex interaction networks involving nutrients, elicitors and metabolites. Metabolomics offers a better understanding of complex biological systems, but evaluating the simultaneous impact of different parameters on metabolic pathways that have many components is a challenging task. We therefore developed a novel approach that combines...

Automated LC-HRMS(/MS) Approach for the Annotation of Fragment Ions Derived from Stable Isotope Labeling-Assisted Untargeted Metabolomics

Structure elucidation of biological compounds is still a major bottleneck of untargeted LC-HRMS approaches in metabolomics research. The aim of the present study was to combine stable isotope labeling and tandem mass spectrometry for the automated interpretation of the elemental composition of fragment ions and thereby facilitate the structural characterization of metabolites. The software tool...

Journal:
  • Proceedings of the National Academy of Sciences of the United States of America

Volume 113, Issue 48

Pages: -

Publication date: 2016